effects of syntactic structure were clearest in Experiment 1 in which
listeners categorized meaningless tonal patterns. Listeners who
categorized  a  syntactically  structured  target  set  performed  better 
than  those  with  an  unstructured  set.  Experiments  2  and  3  were 
similar  to  Experiment  1,  but  listeners  classified  patterns  of  familiar 
brief-duration,  complex  sounds  rather  than  tones.  When  listeners 
in  Experiment  3  were  given  explicit  descriptive  information  about  the 
pattern  components  in  their  instructions,  performance  actually 
improved  for  interpretable  patterns  but  was  slightly  degraded  for 
uninterpretable patterns. This suggests that syntactic and semantic
factors interact in an important way to influence performance. It
was argued that many complex nonspeech patterns have both syntactic
and semantic structure which is determined by the sequence of source
events which produce them. In classifying such patterns, as in the
case of speech, listeners rely on their knowledge of these factors
as well as on the perceptual information in the sound itself.

Sequential Structure and Context in the Classification of
Nonspeech  Transient  Patterns 

Many  current  theorists  have  argued  that  the  recognition  of 
fluent speech involves both top-down or knowledge driven and
bottom-up or data driven processes (Marslen-Wilson & Welsh,
1978; Marslen-Wilson & Tyler, 1980; Cole & Jakimik, 1978,
1980).  In  other  words,  when  human  listeners  perceive  speech 
they  appear  to  use  their  general  knowledge  of  linguistic 
structure  (both  syntactic  and  semantic)  as  well  as  the  specific 
perceptual  information  in  the  signal.  In  contrast,  relatively 
little  research  has  investigated  the  role  of  syntactic  and 
semantic  factors  in  the  perception  of  complex  nonspeech 
patterns.  Although  less  obvious  than  the  case  of  speech,  many 
of  the  complex  nonspeech  sounds  which  we  recognize  in  everyday 
life  have  a  specifiable  sequential  structure  (syntax)  as  well  as 
a  semantic  context.  The  importance  of  sound  effects  in  radio 
drama  confirms  this  point.  One  sequence  of  acoustic  transients 
is  heard  as  someone  opening  a  door  to  enter  a  room,  whereas 
another  can  depict  the  escape  of  bank  robbers  with  the  police  in 
hot  pursuit.  Ordered  sequences  of  nonspeech  transients  of  this 
sort  will  be  referred  to  as  transient  patterns.  These  patterns 
have  sequential  structure  since  their  individual  components 
occur  in  an  order  and  with  durations  determined  by  the  source 
events.  For  many  patterns,  an  experienced  listener  can  identify 
the  source  events  when  presented  with  only  the  sound  pattern. 
The present paper investigates the role of syntactic (i.e.,
sequential structure) and semantic (i.e., contextual knowledge
of  the  source  events)  factors  in  transient  pattern  recognition. 

The  most  convincing  evidence  that  aural  perception  involves 
both  bottom-up  and  top-down  processing  is  found  in  the  speech 
perception  literature.  Logically,  the  "raw  data"  or  specific 
sounds  in  continuous  speech  cannot  be  sufficient  to  account  for 
language  understanding  since  the  "raw"  input  is  neither  complete 
nor  unambiguous  (Cole  &  Jakimik,  1978).  Rather,  the  listener 
must  rely  on  his  or  her  knowledge  of  the  syntax  and  semantics  of 
language,  and  the  constraints  introduced  by  those  elements  that 
can  be  interpreted  unambiguously.  It  is  not  uncommon  for  us  to 
"hear"  missing  words  or  to  correct  mispronounced  words  when  they 
occur  in  fluent  speech.  For  example,  Warren  (1970)  has 
demonstrated  a  "phoneme  restoration"  effect  in  the  perception  of 
spoken  text.  His  listeners  consistently  reported  hearing 
phonemes  that  had  actually  been  replaced  by  a  buzz  or  other 
nonspeech  sound.  Warren  argued  that  this  reflects  the  operation 
of  a  higher-level  process  that  perceptually  produces  the  missing 
phoneme. 

Similarly,  Marslen-Wilson  and  Welsh  (1978)  investigated  the 
tendency  for  listeners  to  correct  mispronounced  words  while 
shadowing  continuous  text.  They  concluded  that  although  speech 
perception  in  this  context  is  primarily  data-driven,  top-down 
processes  serve  to  make  the  system  more  resistant  to  input  noise 
and  to  enhance  overall  recognition  efficiency.  In  other  papers, 
Cole  and  Jakimik  (1978,  1980)  described  experiments  which 
investigated a variety of factors in word recognition. They
concluded that listeners use a number of knowledge sources in
speech  perception  ranging  from  fairly  specific  item-to-item 
syntactic  constraints  to  global  semantic  considerations  such  as 
the  theme  or  title  of  a  story.  In  conclusion  they  argued  "...it 
is  not  only  what  we  hear  that  tells  us  what  we  know;  what  we 
know tells us what we hear" (1978, p. 113).

The  evidence  from  the  speech  literature  that  top-down 
processes  play  a  major  role  in  perception  is  not  particularly 
surprising.  Less  obvious,  however,  is  the  evidence  reported  in 
recent  years  by  Bregman  and  his  associates  (Bregman,  1978)  which 
demonstrates  a  parallel  role  of  knowledge-driven  processes  in 
the  perception  of  relatively  simple  and  semantically 
impoverished  tonal  sequences. 

Bregman's basic assumption is that listeners are "built to
pay  attention  to  acoustic  sources,  not  to  acoustic 
components..."  (p.  74).  At  any  point  in  time  the  single 
waveform  we  hear  is  likely  to  represent  information  combined 
from  several  sources,  and  yet  we  perceive  sounds  from  each  of 
the  separate  sources  individually  (the  dog  barking,  a  Bach 
cantata,  the  telephone  ringing,  for  example)  rather  than  a 
nonsensical hodgepodge. Bregman has referred to this perceptual
phenomenon, the act of sorting our perceived acoustic world into
separate sources, as auditory streaming. From a theoretical
vantage  he  has  argued  that  streaming  occurs  as  the  result  of  a 
perceptual  parsing  of  the  complex  acoustic  input  (Bregman, 
1978),  and  his  research  has  focused  on  identifying  the  rules 
involved in the formation of auditory streams.

To date auditory streaming research has been restricted
largely  to  relatively  simple  stimulus  contexts.  For  example,  a 
listener  may  be  asked  to  judge  whether  one  or  two  melodic 
passages  are  heard  when  six  low-frequency  pure  tones  are  played 
alternately  with  six  high-frequency  pure  tones.  Two  separate 
streams or patterns are readily heard in this context, and the

The results of these experiments have been
consistent  with  similar  research  on  simple  visual  patterns  in 
revealing  general  heuristics  related  to  the  classical  Gestalt 
principles.  Such  things  as  similarity,  good  continuation, 
simplicity,  common  fate  and  closure  all  operate  to  influence  the 
auditory  streams  that  will  be  formed  in  a  specific  context. 

In  general,  Bregman  (1978)  has  concluded  that  two  kinds  of 
factors  are  important  in  auditory  streaming.  First,  there  are 
general — and  probably  innate — rules  or  heuristics  that  can  be 
applied  to  parse  all  signals.  Second,  there  are  more  specific 
rules  related  to  a  listener's  skills,  intentions,  and  knowledge 
of  the  stimuli  that  apply  to  only  selected  patterns.  By 
restricting  his  research  to  relatively  simple  acoustic  patterns, 
Bregman  has  necessarily  concentrated  on  the  former  of  these 
factors.  In  the  present  study  we  investigate  more  complex 
transient patterns and consequently extend Bregman's analysis to
heuristics of the second type.

Our approach to this problem involves an extension of
Reber's "implicit learning" procedure. In his research, Reber
(Reber, 1969, 1976; Reber & Allen, 1978; Reber & Lewis, 1977)
examined  the  role  of  syntactic  structure  in  the  classification 
of  visually  presented  letter  patterns.  Listeners  classified 
either  grammatical  or  nongrammatical  patterns.  The  grammatical 
patterns  were  generated  by  a  simple  finite-state  grammar  similar 
to  the  one  shown  in  the  state-transition  diagram  of  Figure  1. 

An  additional  letter  in  the  pattern  string  is  produced  with 
every  legal  state-transition  made  between  the  initial  and 
terminal  states.  For  example,  the  letter  pattern  "AAACDD"  could 
be  produced  by  the  grammar  and  is  therefore  grammatical,  whereas 
the  pattern  "AADDCC"  would  be  ungrammatical  since  it  could  not 
be  produced  by  the  grammar. 
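
The generation and acceptance process described above can be sketched in a few lines of Python. Because Figure 1 is not reproduced in this text, the transition table below is a hypothetical stand-in and not the grammar actually used; it was chosen only so that "AAACDD" is accepted and "AADDCC" is rejected, as in the example. The state names, the five-letter alphabet A to E, and the function names are likewise assumptions made for illustration.

```python
from itertools import product

# HYPOTHETICAL transition table: (state, letter) -> next state.
# This is NOT the grammar of Figure 1, which is not available here.
TRANSITIONS = {
    ("S0", "A"): "S1",
    ("S0", "B"): "S2",
    ("S1", "A"): "S1",
    ("S1", "C"): "S2",
    ("S2", "D"): "S3",
    ("S3", "D"): "S3",
    ("S3", "E"): "S4",
}
START = "S0"
TERMINAL = {"S3", "S4"}   # assumed terminal (accepting) states
ALPHABET = "ABCDE"        # assumed five output letters


def is_grammatical(pattern):
    """Return True if every letter corresponds to a legal state transition
    and the final state is terminal."""
    state = START
    for letter in pattern:
        state = TRANSITIONS.get((state, letter))
        if state is None:          # no legal transition for this letter
            return False
    return state in TERMINAL


def grammatical_patterns(min_len, max_len):
    """Enumerate all grammatical strings with lengths in [min_len, max_len]."""
    return [
        "".join(letters)
        for n in range(min_len, max_len + 1)
        for letters in product(ALPHABET, repeat=n)
        if is_grammatical(letters)
    ]
```

Under this stand-in grammar, `is_grammatical("AAACDD")` holds and `is_grammatical("AADDCC")` does not, because no transition on D leaves the state reached after two A's.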

Reber's extensive research has demonstrated consistently
superior  classification  performance  with  the  grammatical  as 
opposed  to  nongrammatical,  arbitrarily  grouped  patterns.  This 
has  been  shown  using  a  variety  of  tasks  and  dependent  variables. 
In  the  present  study,  three  experiments  were  conducted  to 
investigate  the  role  of  syntactic  structure  in  a  two-alternative 
transient  pattern  classification  task.  The  simple  finite-state 
grammar  of  Figure  1  was  used  to  generate  syntactically 
structured  patterns  for  all  three  experiments.  In  the  first 
experiment  listeners  classified  meaningless  patterns  of 
brief-duration  pure  tones.  The  second  experiment  was  similar  to 
the  first,  but  the  pattern  components  consisted  of  complex 
familiar  sounds  rather  than  simple  tones.  Although  the 
individual  components  were  familiar  in  this  experiment,  the 
transient patterns were not interpretable. Finally, the role of
both syntactic and semantic factors was investigated in the
third  experiment  in  which  some  listeners  classified  semantically 
interpretable  patterns  of  complex  sounds. 

Experiment  1 

The  first  experiment  was  designed  to  demonstrate  that 
listeners  can  use  syntactic  information  to  facilitate  the 
classification  of  nonspeech  transient  patterns.  In  Experiment  1 
listeners  were  required  to  classify  sequences  of  brief-duration 
pure tones as either "target" or "noise" patterns. For the
Grammatical group the target patterns were generated by the
finite-state grammar of Figure 1, whereas for the other,
Nongrammatical  group  the  targets  were  randomly  determined  but 
matched  to  the  grammatical  targets  in  length.  By  comparing 
performance  with  structured  (Grammatical  group)  and  unstructured 
(Nongrammatical  group)  target  patterns,  we  can  assess  the 
importance  of  syntactic  or  sequential  structure  in  the 
classification  of  simple  unfamiliar  tonal  patterns. 

Method  and  Procedure 

Participants. Ten student volunteers served as listeners
in the experiment. Five were assigned to each of the two
groups.

Stimuli.  Individual  transient  events  consisted  of  five 
pure  tones  selected  to  be  approximately  equally  spaced  in  pitch 
(1157,  1250,  1345,  1442,  and  1542  Hz).  The  grammatical  patterns 
were produced by assigning one of the five tones to each of the
output letters shown in Figure 1 in corresponding ascending
order.  Twelve  grammatical  patterns  ranging  in  length  from  four 
to  six  events  (three,  four,  and  five  patterns  of  each  length, 
respectively)  were  selected  to  make  up  the  "grammatical  target" 
category.  A  corresponding  "nongrammatical  target"  set  was 
produced  by  randomly  sampling  patterns  from  the  total  set  of 
possible  patterns  with  the  restriction  that  sampled  targets 
match  those  of  the  grammatical  target  patterns  in  length. 
Similarly,  48  randomly  constructed  "noise"  patterns  were 
selected  to  be  non-overlapping  with  the  target  sets,  but  to 
match  them  in  length.  Within  the  patterns  each  tone  was 
presented  for  80  msec  at  a  comfortable  listening  level  (87  dB 
SPL).  Successive  tones  were  separated  by  20  msec  of  silence. 
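
The timing of a tonal pattern can be illustrated with a short synthesis sketch using the stated parameters (five tone frequencies, 80 msec tones, 20 msec silent gaps, 12.5 kHz sampling rate). The letter names A to E, the function name, and the unit amplitude are assumptions; the sketch applies no onset or offset shaping and is not calibrated to the reported listening level, since neither is described in enough detail to reproduce.

```python
import math

SAMPLE_RATE_HZ = 12_500   # output sampling rate reported in the Apparatus section
TONE_MS = 80              # duration of each tone
GAP_MS = 20               # silence between successive tones

# Tones assigned to output letters in ascending order.
# The letter names A-E are assumed; only A, C, and D appear in the text.
TONE_HZ = {"A": 1157, "B": 1250, "C": 1345, "D": 1442, "E": 1542}


def synthesize_pattern(pattern, amplitude=1.0):
    """Return a list of float samples for a letter pattern such as 'AAACDD'."""
    tone_len = SAMPLE_RATE_HZ * TONE_MS // 1000   # 1000 samples per tone
    gap_len = SAMPLE_RATE_HZ * GAP_MS // 1000     # 250 samples per gap
    samples = []
    for i, letter in enumerate(pattern):
        if i > 0:
            samples.extend([0.0] * gap_len)       # silent gap before each later tone
        freq = TONE_HZ[letter]
        samples.extend(
            amplitude * math.sin(2 * math.pi * freq * n / SAMPLE_RATE_HZ)
            for n in range(tone_len)
        )
    return samples
```

A six-event pattern therefore occupies 6 x 80 + 5 x 20 = 580 msec, or 7250 samples at this rate.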

Apparatus.  All  experimental  events  were  controlled  by  a 
general-purpose  laboratory  computer.  The  tones  were  synthesized 
with  the  computer  using  standard  digital  techniques.  They  were 
output  on  a  12-bit  digital-to-analog  converter  at  a  sampling 
rate of 12.5 kHz, low-pass filtered at 5 kHz (Krohn-Hite model
3550), attenuated, and presented binaurally over matched
Telephonics TDH-49 headphones with MX-41/AR cushions. Verbal
prompts  were  presented  on  a  video  monitor  in  the  testing  booth, 
and  listeners  indicated  their  responses  by  pressing  buttons  on  a 
solid  state  keyboard. 

Procedure. Listeners were tested individually in a sound
attenuated  booth.  The  experiment  began  when  the  listeners  were 
instructed  that  they  would  be  hearing  patterns  made  up  of 
several notes played very quickly. They were told that some of
the patterns were designated as targets and that their task
would  be  to  pick  out  the  targets.  Although  listeners  were  told 
that  targets  and  nontargets  could  occur  equally  often,  no 
information  was  provided  regarding  the  composition  of  the  target 
set.  However,  they  were  told  that  the  pattern  categories  were 
determined  by  the  order  of  components  and  that  loudness  and 
duration  were  not  relevant  to  the  classification.  The 
Grammatical  and  Nongrammatical  groups  received  identical 
instructions.

Each  trial  began  when  the  word  "LISTEN"  appeared  on  the 
listener's  screen.  A  second  prompt,  "TARGET  (Y  OR  N)?"  followed 
the  pattern  presentation.  The  listener  then  responded  by 
pressing  "Y"  or  "N"  on  the  keypad,  and  visual  feedback  was 
provided  immediately  after  the  response.  After  a  brief 
inter-trial-interval  (1.5  sec),  the  screen  was  erased  and  the 
next  trial  began.  Each  listener  received  96  trials  (four 
presentations  of  each  of  the  12  targets  and  48  presentations  of 
nontargets)  in  each  of  12  blocks.  Pattern  presentation  order 
was  randomized  within  blocks,  and  listeners  completed  four 
blocks  on  each  of  three  days.  Overall,  there  were  1152  trials 
per  individual. 
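
The composition of a block can be checked with a small sketch. It assumes that each of the 48 noise patterns is presented once per block, which is consistent with the counts given but is not stated explicitly; the function name and the placeholder pattern labels are illustrative only.

```python
import random


def build_block(targets, noise, target_reps=4, rng=None):
    """Return one randomized block as a list of (pattern, is_target) trials.

    Assumes each noise pattern appears once per block (an inference from
    the reported counts, not a stated detail of the procedure).
    """
    rng = rng or random.Random()
    trials = [(p, True) for p in targets for _ in range(target_reps)]
    trials += [(p, False) for p in noise]
    rng.shuffle(trials)               # randomize presentation order within the block
    return trials


# Placeholder labels standing in for the 12 targets and 48 noise patterns.
targets = [f"T{i}" for i in range(12)]
noise = [f"N{i}" for i in range(48)]
block = build_block(targets, noise, rng=random.Random(0))
trials_per_listener = 12 * len(block)   # 12 blocks of 96 trials
```

With 12 targets repeated four times and 48 nontargets, each block has 96 trials, half of them targets, and 12 blocks give the 1152 trials reported per listener.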

Immediately  after  the  last  block,  listeners  in  both  groups 
were told that we had used a set of rules, like the rules of
language, to construct the target patterns. We explained that
they  would  be  hearing  a  new  set  of  patterns  and  that  their  task 
would  be  to  classify  each  pattern  as  target  or  nontarget,  "just 
as you can tell if a sentence is grammatically correct without
knowing all the rules for sentences, so should you be able to
tell  whether  any  sound  is  consistent  with  the  rules  we  used  by 
remembering  how  the  targets  sounded."  They  then  completed  an 
additional  block  of  96  trials  responding  as  before,  but  without 
feedback.  The  target  sounds  in  this  test  block  were  selected 
from  the  grammatical  patterns  produced  by  the  grammar  in  Figure 
1  which  were  not  used  as  targets  in  the  experiment.  This  test 
condition  was  included  to  determine  whether  the  listeners  in  the 
Grammatical  group  could  use  their  syntactic  knowledge  to 
classify  novel,  but  grammatical,  patterns.  Each  listener  was 
interviewed  before  leaving. 

Results  and  Discussion 

The  hit  (responding  "yes"  to  a  target)  and  false  alarm 
(responding  "yes"  to  a  nontarget)  rates  were  used  to  compute  a 
response bias free (d') index of performance for each individual
on  each  block.  These  data  were  then  averaged  across  listeners 
within  each  of  the  two  groups  to  assess  group  performance. 
Nearly  perfect  performance  (hit  rate  of  .99,  false  alarm  rate  of 
.01)  would  yield  a  d'  of  4.64.  These  results  are  displayed  in 
Figure  2. 
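
For readers unfamiliar with the index, d' under the standard equal-variance Gaussian model is the difference between the z-transformed hit and false alarm rates. The sketch below assumes that model (the text does not specify the computation used) and relies on Python's `statistics.NormalDist`. With exact normal quantiles, a hit rate of .99 and a false alarm rate of .01 give approximately 4.65; the value of 4.64 quoted above presumably reflects rounded table entries.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit rate) - z(false alarm rate), equal-variance Gaussian model.

    Rates of exactly 0 or 1 are outside the domain of the inverse normal
    CDF and raise statistics.StatisticsError; no correction is applied here.
    """
    z = NormalDist().inv_cdf   # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)
```

Chance performance (equal hit and false alarm rates) gives d' = 0, and a negative d' arises when the false alarm rate exceeds the hit rate.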

Although  the  data  are  not  strictly  monotonic  over  blocks, 
it  is  clear  that  performance  improved  with  practice  for  both 
groups.  This  effect  was  revealed  statistically  by  a  significant 
Block  effect  in  a  two-way  (Group  by  Block)  mixed-design  analysis 
of  variance  with  repeated  measures  on  the  Block  factor 
(F(11,88) = 14.91, p < .001).

It is also clear from Figure 2 that the Grammatical group
reached  a  substantially  higher  overall  performance  level  than 
did  the  Nongrammatical  group.  This  finding  was  demonstrated  by 
a  significant  Block  by  Group  interaction  in  the  analysis  of 
variance (F(11,88) = 2.23, p < .025). The main effect of group
was only marginally significant (F(1,8) = 4.27, p < .10)
indicating  that  the  effect  of  grammaticality  developed  primarily 
with  practice.  Overall,  these  findings  are  consistent  with 
Reber's  earlier  results  with  letter  strings  (Reber,  1969)  and 
with  our  hypothesis  that  listeners  can  use  syntactic  structure 
to  help  them  classify  complex  nonspeech  patterns. 

Of  further  interest  is  the  performance  of  listeners  in  the 
Grammatical  group  with  unfamiliar  patterns  in  the  final 
post-test  block.  If  listeners  in  the  Grammatical  group  had 
internalized  the  syntactic  or  grammatical  structure  of  the 
target  patterns  during  the  experiment,  then  their  performance 
should  be  substantially  better  than  chance  on  the  test  block. 
On  the  other  hand,  listeners  in  the  Nongrammatical  group  would 
have  had  no  opportunity  to  learn  about  the  pattern  grammar  and 
consequently  their  performance  should  be  considerably  worse  than 
that  of  the  Grammatical  group.  A  d'  index  of  performance  was 
computed  for  each  of  the  ten  listeners  on  the  final  test  block. 
These data are presented in Table 1. It is evident from these
data  that  listeners  in  the  Grammatical  group  performed 
substantially  better  than  the  near  chance  levels  (i.e., 
d'  =  0.0)  of  listeners  in  the  Nongrammatical  group.  This 
difference in group mean performance was found to be

Table 1

Final Test Block Performance (d') for All Listeners
in Each of the Three Experiments

                    Individual Listener
Group            1       2       3       4       5    Mean

Experiment 1
  G          -1.48    1.48    1.84    2.58    3.31    1.55
  NG          -.66    -.43    -.23     .10     .56    -.13

Experiment 2
  G           -.34     .10     .32    1.76    3.79    1.13
  NG          -.40    -.12    -.10     .10     .20    -.06

Experiment 3
  G/S          .95    1.26    1.48    2.35            1.51
  G/NS         .86    1.23    1.95    2.55            1.65
  NG/S       -1.20     .00     .00     .34            -.21
  NG/NS       -.88    -.10     .30    1.12             .11

statistically reliable (t(8) = 1.98, p < .05, one-tailed). This
difference occurred despite a relatively large negative d' value
observed  for  listener  1  in  the  Grammatical  group.  The  large 
absolute  magnitude  of  this  value  indicates  that  this  individual 
was  able  to  distinguish  the  grammatical  and  nongrammatical 
patterns  to  some  extent,  but  had  classified  the  targets  as 
nontargets  and  vice  versa.  Since  no  feedback  was  provided 
during  the  test  block,  a  response  reversal  of  this  sort  would 
not  be  detected  easily  by  the  listener.  Overall,  performance  on 
the  final  block  supports  our  position  that  listeners  in  the 
Grammatical  group  had  actually  learned  something  about  the 
syntactic  rules  used  to  generate  the  target  patterns  they  had 
classified  previously. 
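
The group comparison on the final test block can be reproduced from the individual Experiment 1 values in Table 1. The sketch below assumes an independent-samples t test with pooled variance, which is consistent with the reported 8 degrees of freedom but is not named in the text; the function and variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Final test block d' values for Experiment 1, taken from Table 1.
grammatical = [-1.48, 1.48, 1.84, 2.58, 3.31]
nongrammatical = [-0.66, -0.43, -0.23, 0.10, 0.56]

t_value, df = pooled_t(grammatical, nongrammatical)
```

Under this assumption the statistic comes out at about 1.98 on 8 degrees of freedom, in agreement with the value reported above.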

Experiment  2 

Experiment  1  demonstrated  that  listeners  can  use  syntactic 
pattern  structure  to  their  advantage  in  classifying  simple  tonal 
transient  patterns.  The  question  arises  as  to  whether  a  similar 
result  would  occur  for  sound  patterns  made  up  of  complex, 
realistic  transients.  To  investigate  this,  a  second  experiment 
was  conducted  in  which  listeners  classified  patterns  as  either 
targets  or  nontargets  under  conditions  similar  to  those  of 
Experiment  1.  One  group  of  listeners  had  targets  produced  by 
the grammar of Figure 1, whereas the other group had a randomly
constructed  target  set.  However,  unlike  the  previous 
experiment,  the  individual  transient  events  used  here  were 
familiar, but unrelated real world sounds recorded in the
laboratory.

Method 

Participants. Ten student volunteers served in the
experiment, five in the Grammatical group, and five in the
Nongrammatical group. None had served in the previous
experiment.

Stimuli.  Five  individual  acoustic  transients  were  selected 
from a larger set of common "real world" sounds collected in our
laboratory.  The  larger  set  was  produced  by  recording  a  variety 
of  events  such  as  a  "clank"  (hammer  striking  a  heavy  metal 
object),  a  "thump"  (a  hollow,  resonant  sound  from  striking  a 
metal  drum),  and  other  similar  sounds.  These  samples  were  then 
digitized  using  standard  signal  processing  techniques  with  a 
10-bit  analog-to-digital  converter  at  a  12.5  kHz  sampling  rate. 
The  initial  82  msec  segment  of  each  sound  was  then  submitted  to 
a  physical  analysis  before  being  used  in  the  experiment.  A  name 
and  brief  description  for  each  transient  is  presented  in  Table 
2. 

The  12  grammatical  targets,  12  nongrammatical  targets,  and 
48  noise  patterns  were  generated  as  in  Experiment  1.  Each  sound 
was  presented  for  82  msec  at  a  comfortable  listening  level  that 
differed  for  each  sound.  Successive  transients  were  separated 
by  510  msec  of  silence  within  a  pattern. 

Apparatus. Same as in Experiment 1.

Procedure. The procedure was identical to that of Experiment 1.

Table 2

Transient Sounds used in Experiments 2 and 3

Experiment 2

  Drill         82 msec recording of a high speed hand drill
                being turned on
  Clap          82 msec recording of a hand clap
  Steam         82 msec recording of white noise bandpass
                filtered between 4.6 kHz and 5.4 kHz
  Clank         82 msec recording of a hammer striking a C-clamp
  Wood          82 msec recording of two pieces of wood being
                struck together

Experiment 3

  Open Valve    320 msec recording of a radiator valve being turned
  Water Drop    38 msec recording of a drop of water
  Steam         320 msec recording of white noise bandpass
                filtered between 4.6 kHz and 5.4 kHz
  Clang         320 msec recording of a metal object striking a
                radiator pipe
  Water Flush   320 msec recording of water flushing down a drain

Results and Discussion

Hit  and  false  alarm  rates  were  used  to  compute  a  response 
bias  free  (d')  index  of  performance  for  each  individual  on  each 
block.  These  data  were  then  averaged  across  listeners  within 
the  two  groups  to  assess  group  performance.  These  data  are 
presented  in  Figure  3.  Overall,  the  results  are  similar  to 
those of Experiment 1. Although it appears that performance for
the  Nongrammatical  group  may  have  reached  an  asymptote  by  block 
9,  performance  generally  improved  with  practice  for  both  groups 
as  indicated  by  a  significant  Block  effect  in  a  two-way  (Group 
by  Block)  mixed-design  analysis  of  variance  (F(11,88)  =  6.95, 
p < .001). Furthermore, the Grammatical group did better than
the  Nongrammatical  group  on  all  but  the  first  block.  As  in 
Experiment  1,  however,  the  performance  difference  between  the 
two  groups  developed  with  practice  since  the  Group  factor 
interacted reliably with Block (F(11,88) = 2.82, p < .005), but
did not produce a significant main effect (F(1,8) = 2.79,
p > .10). On the basis of these findings we can generalize the
conclusion  of  Experiment  1  to  include  patterns  of  familiar 
transients  as  well  as  patterns  of  simple  tones.  It  appears  that 
listeners  are  able  to  use  syntactic  or  grammatical  structure  to 
facilitate  classification  in  both  cases. 

However,  it  is  interesting  to  note  that  the  overall 
performance  level  reached  by  the  Grammatical  group  in  the 
present  experiment  (mean  d'  =  1.51)  was  considerably  lower  than 
that  reached  in  the  earlier,  tonal  pattern  experiment  (mean 
d' = 2.06). Furthermore, this difference cannot be attributed
to the different pattern components used since the mean
performance  levels  observed  for  the  Nongrammatical  groups  were 
virtually  identical  across  the  two  experiments  (mean  d'  =  .87, 
Experiment  1;  mean  d'  =  .88,  Experiment  2).  This  suggests  that 
structured  patterns  of  unrelated,  familiar  sounds  may  be  more 
difficult  for  listeners  to  classify  than  structured  patterns  of 
simple  tones. 

Although  the  present  experiment  was  designed  to  investigate 
the  role  of  syntactic  processes  in  pattern  classification,  the 
overall  difference  observed  between  Experiments  1  and  2  can  be 
explained  most  easily  by  referring  to  semantic  processing.  In 
the  present  experiment,  the  listeners  were  able  to  recognize  the 
individual  transients.  In  the  words  of  one  participant,  "These 
are  the  kinds  of  sounds  we  walk  around  all  day  trying  to 
ignore."  Since  the  sounds  were  familiar,  the  listener  could  not 
avoid  using  a  parsing  strategy  that  tried  to  make  "sense"  out  of 
the  patterns.  Since  the  finite-state  grammar  we  used  was 
semantically  arbitrary,  this  would  prove  to  be  an  impossible 
task.  On  the  other  hand,  Grammatical  listeners  in  Experiment  1 
had  little  difficulty  in  using  purely  syntactic  parsing  rules 
since they did not expect the tonal transients to form
semantically  sensible  patterns.  This  point  is  dramatized  by  one 
outspoken  listener  in  Experiment  2  who  offered  the  unsolicited 
advice  that  we  should  have  used  tones  instead  of  sounds  to  make 
the  task  easier! 

Finally,  the  results  obtained  for  the  final  test  block  with 
novel grammatical patterns revealed generally poor performance
for all listeners with only two individuals in the Grammatical
group  (listeners  4  and  5)  performing  better  than  chance.  The 
performance  level  observed  for  each  individual  is  shown  in  Table 
1. Although the group mean performance was somewhat greater for
the  Grammatical  group  than  the  Nongrammatical  group,  this 
difference  did  not  approach  significance  (t(8)  =  .99).  This 
finding  is  consistent  with  the  above  discussion  in  suggesting 
that  the  listeners  have  a  great  deal  of  difficulty  in 
abstracting  syntactic  structure  from  patterns  of  familiar  sounds 
that  do  not  relate  to  any  interpretable  sequence  of  source 
events.  The  two  listeners  in  the  Grammatical  group  who  seemed 
able  to  do  this  may  have  ignored  the  distracting  semantic 
contents  of  the  individual  pattern  components  successfully. 

Experiment  3 

The  results  of  the  two  experiments  reported  above  suggest 
that  a  more  elaborate  investigation  of  both  syntactic  and 
semantic  factors  in  transient  classification  would  be 
appropriate.  In  Experiments  1  and  2  we  found  that  syntactically 
structured  patterns  were  easier  to  classify  than  unstructured 
patterns.  Listeners  in  the  present  experiment  were  also 
required  to  classify  either  grammatical  or  nongrammatical  target 
patterns,  but  the  procedure  was  extended  so  that  we  could 
evaluate  explicitly  the  semantic  component  in  aural 
classification. 

A  comparison  of  our  findings  from  Experiments  1  and  2 
suggested that the semantic cues provided by individual
transient components may be distracting when the overall pattern
does  not  lend  itself  to  an  interpretable  semantic  analysis.  In 
the  present  experiment,  some  listeners  were  required  to  classify 
transient  patterns  that  were  semantically  sensible.  To  include 
this  condition  it  is  essential  to  have  patterns  that  are  both 
syntactically  and  semantically  reasonable.  In  other  words,  the 
grammar  cannot  be  arbitrary,  but  must  reflect  the  temporal 
structure  of  possible  real-world  events.  Similarly,  the 
selection  of  individual  transient  events  must  be  consistent  with 
the  grammar. 


The simple finite-state grammar of Figure 1 had been developed
with these criteria in mind. The grammar can represent possible
temporal relations among a series of water
and  steam  related  events  when  appropriate  complex  transients  are 
substituted  for  the  tones  and  sounds  used  in  the  preceding 
experiments.  The  five  sounds  employed  in  Experiment  3  are 
described  briefly  in  Table  2.  To  illustrate  how  semantically 
interpretable  patterns  can  be  produced,  consider  an  output 
string A-A-A-C-D-D from the grammar in Figure 1. This
corresponds  to  a  pattern  that  could  represent  someone  taking 
three  turns  to  open  a  valve  which  releases  steam  which,  in  turn, 
causes  pipes  to  clang  twice.  Similar  source  scenarios  can  be 
provided  for  other  grammatical  patterns.  To  evaluate  the 
possible  role  of  semantic  context  in  aural  transient 
classification,  one  half  of  the  listeners  in  the  present  study 
were  read  a  brief  paragraph  that  suggested  a  schema  or  theme  for 
the patterns they would hear. The paragraph was suggestive, but
did not identify any specific patterns explicitly:

All  of  the  individual  sounds  relate  to  water  and 
steam.  You  will  hear  such  things  as  drips,  water 
flushing  down  a  drain,  a  valve  being  turned  on,  steam 
escaping,  and  radiator  pipes  clanging. 

The  remaining  listeners  received  no  semantic  information  about 
the  patterns.  The  role  of  semantic  context  in  transient 
classification  can  be  assessed  by  comparing  performance  across 
the  two  instructional  conditions. 
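
The way a finite-state grammar yields patterns such as A-A-A-C-D-D can be sketched in a few lines of Python. The transition table below is purely illustrative and is not the grammar of Figure 1; the state names and the function `is_grammatical` are assumptions introduced here. The readings of A (valve turn), C (steam escaping), and D (pipe clang) follow the example string discussed earlier, whereas the roles given to B (drip) and E (water flushing) are guesses.

```python
# Hypothetical transition table; NOT the grammar of Figure 1.
# Keys are (current state, sound symbol); values are the next state.
TRANSITIONS = {
    ("start", "A"): "valve",   # A: a turn of the valve
    ("start", "B"): "drip",    # B: a drip (assumed assignment)
    ("valve", "A"): "valve",   # further turns of the valve
    ("valve", "C"): "steam",   # C: steam escaping
    ("steam", "D"): "clang",   # D: pipes clanging
    ("clang", "D"): "clang",   # repeated clangs
    ("drip", "B"): "drip",     # repeated drips
    ("drip", "E"): "drain",    # E: water flushing (assumed assignment)
}
ACCEPTING = {"clang", "drain"}


def is_grammatical(pattern):
    """Return True if a hyphen-separated pattern is accepted by the table."""
    state = "start"
    for symbol in pattern.split("-"):
        state = TRANSITIONS.get((state, symbol))
        if state is None:          # no legal transition: pattern rejected
            return False
    return state in ACCEPTING


print(is_grammatical("A-A-A-C-D-D"))  # the example string from the text
print(is_grammatical("D-C-A"))        # a scrambled, nongrammatical string
```

Under this toy table the example string is accepted and the scrambled string is rejected, which mirrors the distinction between the Grammatical and Nongrammatical target sets.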

To  summarize,  four  groups  were  tested  in  the  present 
experiment.  The  groups  were  determined  by  factorially  combining 
the two syntactic (Grammatical and Nongrammatical) and two
semantic  (Semantic  instructions  and  No  Semantic  instructions) 
variables.  The  Grammatical/Semantic  group  classified  structured 
target  patterns  and  received  the  semantic  information  described 
above,  whereas  the  Grammatical/No  Semantic  group  categorized  the 
same  structured  target  patterns  without  any  explicit  semantic 
instructions.  Two  corresponding  nongrammatical  target  groups 
were tested (Nongrammatical/Semantic and
Nongrammatical/No Semantic). The possible interaction of the
syntactic  and  semantic  factors  was  of  particular  interest  here. 
Specifically,  the  findings  discussed  above  suggest  that  explicit 
semantic  instructions  may  induce  a  semantic  parsing  strategy 
which  would  facilitate  classification  for  interpretable  patterns 
(Grammatical group), but interfere with performance when the
patterns  were  not  interpretable  (Nongrammatical  group). 

Method

Participants. Sixteen student volunteers served as
listeners  in  the  experiment,  four  in  each  group.  No  listener 
had  served  in  either  of  the  previous  experiments. 

Stimuli. The five thematically related transient sounds
described  in  Table  2  were  recorded  and  digitized  as  in 
Experiment  2.  These  were  then  combined  to  form  sequential 
transient  patterns  as  in  the  earlier  experiments.  Twelve 
grammatical  target  patterns  were  produced  using  the  grammar  of 
Figure 1. The unstructured target and noise patterns were
constructed  randomly  as  in  the  earlier  experiments.  Within  a 
pattern,  each  transient  was  presented  for  a  brief  duration  (32 
msec  for  the  drip  and  approximately  300  msec  for  all  others)  at 
a  comfortable  listening  level  that  differed  slightly  for  the 
various  sounds  to  enhance  realism.  Successive  sounds  were 
separated  by  510  msec  within  the  patterns. 

Apparatus. Same as in Experiments 1 and 2.

Procedure. The procedure was identical to that of
Experiments  1  and  2.  All  four  groups  were  tested  for  12  blocks 
on  the  target  patterns  with  feedback  and  for  one  block  on  the 
test  patterns  without  feedback. 

Results  and  Discussion 

The hit and false alarm rates were used to compute a
response-bias-free (d') index of performance for each individual
on each block. These data were then collapsed across
individuals within each group to determine group performance
levels. These mean data are plotted across blocks for each of
the four groups in Figure 4. First, performance generally improved
with practice. This finding was confirmed
statistically by a significant main effect of Block in a
three-way (Pattern Structure by Semantic Instructions by Block),
mixed-design analysis of variance (F(11,132) = 25.33, p < .001).
It  is  clear  from  Figure  4,  however,  that  large  differences  exist 
in  the  effects  of  practice  across  the  four  groups.  In 
particular,  the  two  Grammatical  groups  showed  considerable 
improvement  with  practice,  whereas  the  two  Nongrammatical  groups 
showed  relatively  little  improvement.  This  was  revealed  by  a 
statistically  significant  Pattern  Structure  by  Block  interaction 
(F(11,132) = 9.96, p < .001) and is consistent with the results
of  Experiments  1  and  2.  No  interaction  was  observed  between  the 
Semantic  Instruction  and  Block  factors  (F(11,132)  <  1.0). 
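
For readers unfamiliar with the index, d' is conventionally computed as the difference between the z-transformed hit and false alarm rates. The sketch below assumes this standard yes/no formula; the function name `d_prime` is hypothetical, and the source does not state how hit or false alarm rates of exactly 0 or 1 were handled, so the sketch simply rejects them.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Standard yes/no sensitivity index: z(hit rate) - z(false alarm rate)."""
    for rate in (hit_rate, false_alarm_rate):
        if not 0.0 < rate < 1.0:
            # z is undefined at 0 and 1; any correction is left to the user.
            raise ValueError("rates must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf   # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Equal hit and false alarm rates indicate chance-level sensitivity.
print(d_prime(0.5, 0.5))
```

A listener who responds "target" equally often to targets and nontargets obtains a d' of zero regardless of overall response bias, which is why the index is described as bias free.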

Second,  listeners  in  the  two  Grammatical  groups  also 
performed at a significantly higher level than did listeners in
the  two  Nongrammatical  groups.  This  was  supported  by  a 
significant main effect of Pattern Structure (F(1,12) = 59.11,
p < .001). This finding is not consistent with the earlier
experiments  in  which  the  effect  of  pattern  syntax  emerged 
largely  with  practice.  It  suggests  that  the  facilitative  effect 
of  sequential  structure  was  stronger  in  the  present  experiment 
than  in  the  earlier  studies. 

Third,  there  appears  to  be  no  overall  difference  between 
listeners  who  were  provided  with  a  semantic  context  and  those 
who  were  not.  The  main  effect  of  Semantic  Instruction  did  not 
approach  statistical  significance  (F(1,12)  <  1.0). 
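
The degrees of freedom reported in these analyses follow from the design, assuming the standard partition for a mixed design with listeners nested within the four groups. The symbols N (16 listeners), g (4 groups), and b (12 training blocks) are introduced here for illustration only:

```latex
\begin{align*}
df_{\text{between-listener error}} &= N - g = 16 - 4 = 12,\\
df_{\text{Block}} &= b - 1 = 12 - 1 = 11,\\
df_{\text{within-listener error}} &= (b - 1)(N - g) = 11 \times 12 = 132.
\end{align*}
```

These values correspond to the F(1,12) tests of the between-group effects and the F(11,132) tests of effects involving Block.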


Figure 4. Mean performance on classification of "real
world" sounds for structured (G) and unstructured (NG) patterns,
with (S) and without (NS) thematic information about the sounds,
plotted by trial block.

Nevertheless, it is apparent that the semantic instructions
influenced classification performance. Specifically, semantic
instructions appeared to enhance performance for those listeners
who also received syntactically structured target patterns
(Grammatical/Semantic vs. Grammatical/No Semantic), but impair
performance for those who classified the syntactically
unstructured targets (Nongrammatical/No Semantic vs.
Nongrammatical/Semantic). This result was seen in a
marginally reliable Pattern Structure by Semantic Instruction
interaction (F(1,12) = 4.73; 4.75 required for p = .05). This
effect  was  examined  further  in  a  post-hoc  analysis.  Lindquist's 
test  of  critical  differences  revealed  that  the 
Grammatical/Semantic  group  performed  significantly  better  than 
the  Grammatical/No  Semantic  group  (observed  difference  of  .51; 
critical difference of .51 for p = .05). However, the apparent
difference  between  the  Nongrammatical/Semantic  and 
Nongrammatical/No  Semantic  groups  did  not  reach  statistical 
significance  (observed  difference  of  .27;  critical  difference 
of .51 for p = .05).

The  above  finding  clearly  demonstrates  the  importance  of 
semantic  context  in  aural  transient  classification.  In 
addition,  it  underscores  our  earlier  conclusion  that  these 
effects  are  not  always  facilitative.  In  particular,  semantic 
cues — in  our  case  the  explicit  semantic  description  of  the 
sounds — appear  to  induce  semantic  parsing  strategies  that  can 
enhance  performance  only  for  semantically  interpretable  patterns 
(Grammatical  groups).  Listeners  in  the  Grammatical/Semantic 
group were able to use the semantic information we provided to
their  advantage.  Here,  a  semantic  parsing  strategy  was 
appropriate  in  that  it  could  lead  to  sensible  interpretations 
for  the  target  patterns.  These  listeners  performed  reliably 
better  than  their  counterparts  who  had  to  depend  on  the 
syntactic  structure  of  the  patterns  and  the  implicit  semantic 
cues  in  the  isolated  transients  alone. 

On  the  other  hand,  the  semantic  instructions  led  to  a 
slight,  albeit  nonsignificant,  impairment  of  performance  when 
semantically  anomalous  patterns  were  used  (Nongrammatical 
groups).  For  these  individuals,  the  specific  thematic 
instructions  inappropriately  led  them  to  search  for  sensible 
interpretations  of  the  patterns  when  none  existed. 

Although the explicit thematic instructions were an obvious
source of semantic information in the present study, the
familiar sounds themselves clearly provided an
additional source of semantic cues. Since these cues were
available  for  listeners  in  all  four  groups,  it  was  not  possible 
to  assess  their  effects  explicitly  in  the  present  study. 

Finally, performance on the final test block was examined to
determine  whether  any  listeners  were  able  to  generalize  their 
knowledge  of  the  target  set  to  new,  grammatical  test  patterns. 
The  d'  performance  levels  on  the  final  test  block  are  shown  for 
each  individual  in  Table  1.  As  expected,  listeners  in  the  two 
Nongrammatical  groups  responded  at  approximately  chance  levels, 
whereas  listeners  in  both  Grammatical  groups  responded  at  above 
chance  levels.  Overall,  listeners  in  the  two  Grammatical  groups 
(mean d' = 1.58) performed reliably better on the test patterns
than did listeners in the two Nongrammatical groups (mean
d' = -.05) (t(14) = 2.40, p < .025, one-tailed). Those who
received  semantic  instructions  performed  slightly  worse  than 
those  with  no  explicit  semantic  instructions,  but  this 
difference  was  not  statistically  reliable  (t(6)  =  .21).  These 
findings  indicate  that  listeners  in  the  Grammatical  groups  were 
able  to  internalize  aspects  of  the  pattern  grammar  regardless  of 
whether  explicit  semantic  instructions  were  provided. 

General  Discussion 

Overall,  the  results  presented  above  have  demonstrated  that 
both  syntactic  and  semantic  factors  can  play  an  important  role 
in  the  classification  of  acoustic  transient  patterns.  Although 
pattern  syntax  influenced  performance  in  all  three  experiments, 
the  effects  of  syntactic  structure  were  most  clearly  seen  in 
Experiment  1  in  which  listeners  categorized  meaningless  tonal 
patterns.  Here,  listeners  who  categorized  a  grammatically 
structured  target  set  performed  substantially  better  than  those 
with  an  unstructured  set.  Listeners  in  the  former  group  were 
also  able  to  generalize  their  knowledge  of  the  grammar  to  a 
novel  set  of  grammatical  test  patterns.  These  results  are 
consistent  with  Reber's  earlier  findings  with 
visually-presented,  meaningless  letter  strings. 

Reber  has  argued  that  listeners  exposed  to  structured 
stimuli internalize a "conceptual structure" which represents
the  underlying  grammar  or  rules,  and  that  this  abstraction 
process occurs implicitly rather than explicitly. In this sense
he  has  argued  that  the  learning  of  synthetic  grammars  of  the 
sort  employed  in  the  present  study  is  similar  to  the  acquisition 
of  natural  language  grammars  (Reber  &  Allen,  1978).  Our 
findings  are  generally  consistent  with  this  interpretation  since 
in  the  post-experimental  interview,  listeners  found  it 
impossible  to  articulate  the  rules  they  used  to  identify  the 
target  patterns.  Although  a  few  listeners  were  able  to  indicate 
some  obvious  properties  of  the  grammatical  patterns  (e.g.,  the 
fact that they began with one of two sounds), most specified
such  vague  classification  rules  as:  the  targets  were  "more 
coherent"  or  "flowed  better"  and  the  nontargets  were 
"unexplainably  different"  or  "not  harmonious."  Regardless  of  how 
the  pattern  structure  is  internalized,  however,  the  present 
results  make  it  clear  that  syntactic  structure  does  influence 
the  processing  of  complex  transient  patterns. 

Furthermore,  it  is  obvious  from  Experiments  2  and  3  that 
the  effects  of  pattern  syntax  cannot  be  considered  in  isolation. 
Rather,  syntactic  and  semantic  factors  interact  in  an  important 
way  to  determine  categorization  performance.  For  example,  in 
Experiment  2  listeners  categorized  uninterpretable  patterns  of 
familiar  sounds.  Although  clear  syntactic  effects  were  observed 
in  this  experiment,  the  effect  of  pattern  structure  was 
considerably  smaller  than  the  corresponding  effect  in  Experiment 
1.  This  suggests  that  the  listeners'  semantic  knowledge  (i.e., 
their  familiarity  with  the  pattern  components)  actually 
interfered  with  their  ability  to  abstract  the  pattern  structure. 

The importance of semantic factors in aural classification
was  even  more  obvious  in  the  third  experiment.  When  listeners 
were  given  explicit  descriptive  information  about  the  pattern 
components  in  their  instructions,  performance  actually  improved 
for  interpretable  patterns  but  was  slightly  degraded  for 
uninterpretable  patterns.  Although  Cole  and  Jakimik  (1978)  have 
demonstrated  that  the  theme  or  title  of  a  story  influences  the 
linguistic  processing  of  specific  words,  the  strength  of  the 
present  effect  with  nonspeech  patterns  is  somewhat  surprising. 

One  explanation  of  the  effect  is  based  on  a  relatively 
simple  labeling  strategy.  It  is  possible  that  the  descriptive 
instructions  we  provided  equipped  the  listeners  with  a 
consistent  set  of  labels  for  the  pattern  components.  As  a 
result,  these  listeners  could  employ  more  effective  encoding  and 
chunking  strategies  to  facilitate  the  learning  of 
pattern/category pairs in a paired-associate fashion. While it
appears  likely  that  labeling  differences  of  this  sort  played  a 
role  in  Experiment  3,  it  is  apparent  that  the  semantic 
instruction  effect  cannot  be  attributed  exclusively  to  labeling. 
In  particular,  this  explanation  cannot  account  for  the  absence 
of  labeling  facilitation  for  the  uninterpretable  patterns  (the 
Nongrammatical/Semantic condition).

A  more  compelling  explanation  proposes  that  listeners  are 
influenced  by  existing  semantic  structures  when  perceiving 
patterns  of  familiar  complex  sounds.  These  existing  structures 
have  been  referred  to  as  frames  (Minsky,  1975)  or  scripts 
(Schank  &  Abelson,  1977).  In  Minsky's  view,  a  frame  is  simply  a 
"data-structure for representing a stereotyped situation" (1975,
p.  212).  We  propose  that  most  individuals  have  frames  for  a 
wide  range  of  possible  acoustic  transient  patterns.  Each  frame 
represents  the  source  events  for  a  particular  pattern.  When  an 
initial  sound  occurs,  the  listener  refers  to  the  likely  source 
scenarios — the  frames — which  contain  the  sound  as  a  beginning 
component.  In  other  words,  the  listener  constructs  hypotheses 
about  what  the  entire  pattern  will  be,  based  on  partial 
perceptual  information  and  his  or  her  existing  knowledge.  As 
successive  transients  are  heard  and  interpreted,  inappropriate 
frames  can  be  eliminated  until,  ultimately,  enough  information 
is  accumulated  for  the  pattern  to  be  associated  with  an 
appropriate  source  scenario.  In  this  view,  the  interpretation 
of  complex  transient  patterns  results  from  an  interplay  of 
bottom-up  and  top-down  processes. 
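
The frame-elimination process described above can be sketched as a simple prefix filter. The frames, their names, and the function `surviving_frames` are hypothetical illustrations rather than part of the study, and the sketch treats each frame as a fixed sequence of sound labels, which is a deliberate simplification of Minsky's notion.

```python
# Hypothetical frames: each maps a source scenario to its expected sounds.
FRAMES = {
    "radiator heating up": ["valve", "valve", "steam", "clang"],
    "filling a sink": ["valve", "drip", "drain"],
    "leaky faucet": ["drip", "drip", "drain"],
}


def surviving_frames(frames, heard):
    """Return the names of frames whose opening sounds match those heard so far."""
    return [name for name, events in frames.items()
            if events[:len(heard)] == heard]


# As each transient arrives, frames inconsistent with the input are dropped.
heard = []
for sound in ["valve", "valve", "steam"]:
    heard.append(sound)
    print(sound, "->", surviving_frames(FRAMES, heard))
```

In this toy run the first sound leaves two candidate scenarios and the second sound leaves one, illustrating how partial bottom-up information can narrow top-down hypotheses. A pattern that matches no frame leaves the candidate list empty, which is one way to picture the uninterpretable case.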

In  such  a  system,  explicit  semantic  instructions  would  not 
only  provide  the  listener  with  a  set  of  component  labels,  but 
with  a  set  of  possible  frames  as  well.  In  Experiment  3,  the 
instructed  listeners  would  attempt  to  relate  the  patterns  they 
heard  to  familiar  scenarios  involving  steam  and  water  flow. 
These frames or scenarios would be appropriate in the case of
interpretable  (i.e.,  grammatical)  patterns,  but  inappropriate 
for  the  uninterpretable  patterns.  In  the  latter  case,  the 
listeners'  inability  to  interpret  the  patterns  using  the 
suggested  frames  would  prove  distracting.  In  other  words,  these 
listeners  would  be  unable  to  make  semantic  sense  out  of  the 
patterns  despite  the  fact  that  the  individual  pattern  components 
were familiar and consistent with the labels we provided. The
fact that an explicit semantic context led to degraded
performance with uninterpretable patterns suggests that it may be
very  difficult  to  ignore  a  pattern's  meaning  even  though  it 
would  be  advantageous  to  do  so.  On  the  other  hand,  when  no 
semantic  information  was  provided  explicitly,  the  listeners  may 
have  simply  constructed  their  own  appropriate  labels  and  frames 
for  the  patterns. 

In  conclusion,  we  have  argued  that  many  complex  sound 
patterns  have  both  syntactic  and  semantic  structure  which  is 
determined  by  the  sequence  of  source  events  which  produce  them. 
In  interpreting  such  patterns,  human  listeners  rely  on  their 
knowledge  of  these  factors  as  well  as  on  the  perceptual 
information  available  in  the  sound  itself.  Most  theorists  agree 
that  this  occurs  in  the  processing  of  linguistic  information, 
and  current  research  is  underscoring  the  importance  of  syntactic 
and  semantic  factors  in  the  perception  of  complex  visual  scenes 
(Biederman,  1980).  Despite  this,  however,  the  role  of  these 
factors  in  the  classification  of  nonlinguistic  acoustic  patterns 
has  not  been  demonstrated  previously.  In  the  present  study  we 
have  shown  that  these  factors  can  play  a  significant  role  in 
even  relatively  simple  classification  tasks.  Additional  work  is 
needed  to  elaborate  their  effects  and  to  determine  the  influence 
of  pattern  structure  on  more  traditional  psychoacoustic  measures 
such  as  the  listener's  ability  to  resolve  individual  pattern 
components  (Watson  &  Kelly,  in  press). 